Overview

Dataset statistics

Number of variables: 15
Number of observations: 5530
Missing cells: 5933
Missing cells (%): 7.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.3 MiB
Average record size in memory: 429.0 B

Variable types

NUM: 9
CAT: 6

Reproduction

Analysis started: 2022-03-03 16:50:43.132522
Analysis finished: 2022-03-03 16:51:05.089354
Duration: 21.96 seconds
Version: pandas-profiling v2.7.1
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

CUST_ID has a high cardinality: 5530 distinct values (High cardinality)
CASH_ADVANCE has a high cardinality: 2609 distinct values (High cardinality)
PURCHASES_TRX has a high cardinality: 80 distinct values (High cardinality)
MINIMUM_PAYMENTS has a high cardinality: 5441 distinct values (High cardinality)
GENDER has 2714 (49.1%) missing values (Missing)
CASH_ADVANCE_TRX has 150 (2.7%) missing values (Missing)
ONEOFF_PURCHASES_FREQUENCY has 2740 (49.5%) missing values (Missing)
CASH_ADVANCE_FREQUENCY has 166 (3.0%) missing values (Missing)
TENURE has 163 (2.9%) missing values (Missing)
CUST_ID is uniformly distributed (Uniform)
CUST_ID has unique values (Unique)
PAYMENTS has unique values (Unique)
PURCHASES has 1393 (25.2%) zeros (Zeros)
CASH_ADVANCE_TRX has 2812 (50.8%) zeros (Zeros)
PURCHASES_FREQUENCY has 1392 (25.2%) zeros (Zeros)
ONEOFF_PURCHASES_FREQUENCY has 1464 (26.5%) zeros (Zeros)
CASH_ADVANCE_FREQUENCY has 2801 (50.7%) zeros (Zeros)
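
The overview figure of 5933 missing cells (7.2%) can be cross-checked against the per-variable missing-value warnings. The following is a minimal sketch of that arithmetic. The dictionary and variable names are illustrative, and the counts are copied from the report rather than recomputed from the data.

```python
# Per-variable missing counts as listed in the warnings of this report.
missing_by_column = {
    "GENDER": 2714,
    "CASH_ADVANCE_TRX": 150,
    "ONEOFF_PURCHASES_FREQUENCY": 2740,
    "CASH_ADVANCE_FREQUENCY": 166,
    "TENURE": 163,
}
n_variables = 15
n_observations = 5530

total_missing = sum(missing_by_column.values())
total_cells = n_variables * n_observations
missing_pct = 100 * total_missing / total_cells

print(total_missing)          # 5933
print(total_cells)            # 82950
print(round(missing_pct, 1))  # 7.2
```

Both results agree with the "Missing cells" entries in the dataset statistics.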

Variables

CUST_ID
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE
Distinct count: 5530
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 43.3 KiB

Value Count
C12529 1
C14163 1
C13265 1
C16553 1
C16098 1
Other values (5525) 5525

Value Count Frequency (%)
C12529 1 < 0.1%
 
C14163 1 < 0.1%
 
C13265 1 < 0.1%
 
C16553 1 < 0.1%
 
C16098 1 < 0.1%
 
C17438 1 < 0.1%
 
C17125 1 < 0.1%
 
C11238 1 < 0.1%
 
C18933 1 < 0.1%
 
C12713 1 < 0.1%
 
Other values (5520) 5520 99.8%
 
Length

Max length: 6
Mean length: 6
Min length: 6

Value Count Frequency (%)
Decimal_Number 10 90.9%
Uppercase_Letter 1 9.1%

Value Count Frequency (%)
Common 10 90.9%
Latin 1 9.1%

Value Count Frequency (%)
ASCII 11 100.0%

GENDER
Categorical

MISSING
Distinct count: 2
Unique (%): 0.1%
Missing: 2714
Missing (%): 49.1%
Memory size: 43.3 KiB

Value Count
F 1443
M 1373

Value Count Frequency (%)
F 1443 26.1%
 
M 1373 24.8%
 
(Missing) 2714 49.1%
 
Length

Max length: 3
Mean length: 1.981555154
Min length: 1

Value Count Frequency (%)
Uppercase_Letter 2 50.0%
Lowercase_Letter 2 50.0%

Value Count Frequency (%)
Latin 4 100.0%

Value Count Frequency (%)
ASCII 4 100.0%

BALANCE
Real number (ℝ)

Distinct count: 5525
Unique (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1041.700462707233
Minimum: -4587.892398
Maximum: 7390.19856
Zeros: 6
Zeros (%): 0.1%
Memory size: 43.3 KiB

Quantile statistics

Minimum: -4587.892398
5-th percentile: 3.88240525
Q1: 74.060304
median: 632.7436345
Q3: 1545.808455
95-th percentile: 3869.371332
Maximum: 7390.19856
Range: 11978.09096
Interquartile range (IQR): 1471.748151

Descriptive statistics

Standard deviation: 1353.093044
Coefficient of variation (CV): 1.29892718
Kurtosis: 3.290218207
Mean: 1041.700463
Median Absolute Deviation (MAD): 594.745598
Skewness: 1.475458824
Sum: 5760603.559
Variance: 1830860.785

Histogram with fixed size bins (bins=10) [image not reproduced]

Common values

Value Count Frequency (%)
0 6 0.1%
 
107.944741 1 < 0.1%
 
109.621031 1 < 0.1%
 
952.51575 1 < 0.1%
 
559.151424 1 < 0.1%
 
3356.816523 1 < 0.1%
 
4117.751094 1 < 0.1%
 
1679.952713 1 < 0.1%
 
1179.746682 1 < 0.1%
 
37.307085 1 < 0.1%
 
Other values (5515) 5515 99.7%
 
Minimum 5 values

Value Count Frequency (%)
-4587.892398 1 < 0.1%
 
-4530.639094 1 < 0.1%
 
-4251.411617 1 < 0.1%
 
-4071.993764 1 < 0.1%
 
-3948.776884 1 < 0.1%
 
Maximum 5 values

Value Count Frequency (%)
7390.19856 1 < 0.1%
 
7347.355967 1 < 0.1%
 
7293.108794 1 < 0.1%
 
7215.745096 1 < 0.1%
 
7152.864372 1 < 0.1%
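
As a consistency check, the derived figures reported for BALANCE can be recomputed from the other reported values. The sketch below assumes the usual definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared). The variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
import math

# Values copied from the BALANCE statistics in this report.
minimum, maximum = -4587.892398, 7390.19856
q1, q3 = 74.060304, 1545.808455
mean, std = 1041.700463, 1353.093044

value_range = maximum - minimum  # reported Range: 11978.09096
iqr = q3 - q1                    # reported IQR: 1471.748151
cv = std / mean                  # reported CV: 1.29892718
variance = std ** 2              # reported Variance: 1830860.785

for name, value in [("range", value_range), ("iqr", iqr),
                    ("cv", cv), ("variance", variance)]:
    print(f"{name}: {value:.6f}")
```

All four recomputed values match the reported ones to the precision shown in the report.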
 

PURCHASES
Real number (ℝ≥0)

ZEROS
Distinct count: 3682
Unique (%): 66.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 534.5771030741411
Minimum: 0.0
Maximum: 9661.37
Zeros: 1393
Zeros (%): 25.2%
Memory size: 43.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 269.13
Q3: 723.7
95-th percentile: 1975.906
Maximum: 9661.37
Range: 9661.37
Interquartile range (IQR): 723.7

Descriptive statistics

Standard deviation: 773.4887449
Coefficient of variation (CV): 1.446917087
Kurtosis: 18.5878817
Mean: 534.5771031
Median Absolute Deviation (MAD): 269.13
Skewness: 3.268794177
Sum: 2956211.38
Variance: 598284.8385

Histogram with fixed size bins (bins=10) [image not reproduced]

Common values

Value Count Frequency (%)
0 1393 25.2%
 
45.65 21 0.4%
 
150 14 0.3%
 
60 12 0.2%
 
100 10 0.2%
 
450 10 0.2%
 
600 9 0.2%
 
50 9 0.2%
 
250 9 0.2%
 
120 9 0.2%
 
Other values (3672) 4034 72.9%
 
Minimum 5 values

Value Count Frequency (%)
0 1393 25.2%
 
0.01 3 0.1%
 
0.05 1 < 0.1%
 
0.24 1 < 0.1%
 
1 2 < 0.1%
 
Maximum 5 values

Value Count Frequency (%)
9661.37 1 < 0.1%
 
8945.67 1 < 0.1%
 
8834.96 1 < 0.1%
 
8591.31 1 < 0.1%
 
7311.99 1 < 0.1%
 

BALANCE_FREQUENCY
Real number (ℝ≥0)

Distinct count: 58
Unique (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.48255227016275
Minimum: 0.0
Maximum: 1000.0
Zeros: 6
Zeros (%): 0.1%
Memory size: 43.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.363636
Q1: 0.833333
median: 1
Q3: 1
95-th percentile: 1
Maximum: 1000
Range: 1000
Interquartile range (IQR): 0.166667

Descriptive statistics

Standard deviation: 152.899316
Coefficient of variation (CV): 5.773586866
Kurtosis: 34.06665053
Mean: 26.48255227
Median Absolute Deviation (MAD): 0
Skewness: 5.96293972
Sum: 146448.5141
Variance: 23378.20083

Histogram with fixed size bins (bins=10) [image not reproduced]

Common values

Value Count Frequency (%)
1 3554 64.3%
 
0.909091 275 5.0%
 
0.818182 188 3.4%
 
0.545455 158 2.9%
 
0.636364 147 2.7%
 
0.727273 145 2.6%
 
0.454545 135 2.4%
 
0.363636 125 2.3%
 
1000 111 2.0%
 
0.272727 110 2.0%
 
Other values (48) 582 10.5%
 
Minimum 5 values

Value Count Frequency (%)
0 6 0.1%
 
0.090909 23 0.4%
 
0.1 1 < 0.1%
 
0.125 2 < 0.1%
 
0.142857 1 < 0.1%
 
Maximum 5 values

Value Count Frequency (%)
1000 111 2.0%
 
909.091 9 0.2%
 
888.889 1 < 0.1%
 
857.143 2 < 0.1%
 
833.333 1 < 0.1%
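
The largest BALANCE_FREQUENCY values (1000, 909.091, 833.333) are each 1000 times a value that also appears in the 0 to 1 range of this variable (1, 0.909091, 0.833333). One possible reading, which the report itself does not confirm, is that some rows were scaled by 1000 at the source. The sketch below shows how such entries could be mapped back under that assumption. The function name and the threshold of 1.0 are hypothetical choices for illustration.

```python
def rescale_frequency(value, factor=1000.0):
    """Map a value back into [0, 1] if it looks scaled by `factor`.

    Assumption: frequencies are fractions in [0, 1], so anything above 1
    is treated as a mis-scaled entry rather than a genuine observation.
    """
    return value / factor if value > 1.0 else value


# The five largest values from the table of maximum values.
largest = [1000.0, 909.091, 888.889, 857.143, 833.333]
print([round(rescale_frequency(v), 6) for v in largest])
# [1.0, 0.909091, 0.888889, 0.857143, 0.833333]
```

Whether rescaling is appropriate depends on how the data was produced, so the affected rows are worth inspecting before any correction is applied.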
 

CASH_ADVANCE
Categorical

HIGH CARDINALITY
Distinct count: 2609
Unique (%): 47.2%
Missing: 0
Missing (%): 0.0%
Memory size: 43.3 KiB

Value Count
0.0 2808
?? 75
0.0?ñ 41
472.818286 1
2436.195048 1
Other values (2604) 2604

Value Count Frequency (%)
0.0 2808 50.8%
 
?? 75 1.4%
 
0.0?ñ 41 0.7%
 
472.818286 1 < 0.1%
 
2436.195048 1 < 0.1%
 
1831.115496 1 < 0.1%
 
1957.772343 1 < 0.1%
 
1288.83283 1 < 0.1%
 
188.234434 1 < 0.1%
 
2908.400137 1 < 0.1%
 
Other values (2599) 2599 47.0%
 
Length

Max length: 13
Mean length: 6.456057866
Min length: 2

Value Count Frequency (%)
Decimal_Number 10 76.9%
Other_Punctuation 2 15.4%
Lowercase_Letter 1 7.7%

Value Count Frequency (%)
Common 12 92.3%
Latin 1 7.7%

Value Count Frequency (%)
ASCII 12 100.0%
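
CASH_ADVANCE is reported as Categorical because some entries are not valid numbers: the value table lists "??" (75 rows) and "0.0?ñ" (41 rows) alongside ordinary decimals. The following is a minimal sketch of one way to recover a numeric column. It assumes, without confirmation from the report, that a trailing "?ñ" is debris after an otherwise valid number and that a bare "??" stands for a missing value. The function name `parse_amount` is hypothetical.

```python
import math


def parse_amount(raw):
    """Convert a raw CASH_ADVANCE string to float, or NaN if unusable.

    Assumption: a trailing "?ñ" is debris after a valid number, and a
    bare "??" is a placeholder for a missing value.
    """
    text = raw.strip()
    if text.endswith("?ñ"):
        text = text[: -len("?ñ")]
    try:
        return float(text)
    except ValueError:
        return math.nan


samples = ["0.0", "??", "0.0?ñ", "472.818286"]
cleaned = [parse_amount(s) for s in samples]
print(cleaned)  # [0.0, nan, 0.0, 472.818286]
```

The same two patterns appear in the MINIMUM_PAYMENTS and TENURE value tables, so a similar step may apply there.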
 

CASH_ADVANCE_TRX
Real number (ℝ≥0)

MISSING
ZEROS
Distinct count: 34
Unique (%): 0.6%
Missing: 150
Missing (%): 2.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.11542750929368
Minimum: 0.0
Maximum: 18000.0
Zeros: 2812
Zeros (%): 50.8%
Memory size: 43.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 3
95-th percentile: 12
Maximum: 18000
Range: 18000
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 573.8177709
Coefficient of variation (CV): 11.68304543
Kurtosis: 469.4166907
Mean: 49.11542751
Median Absolute Deviation (MAD): 0
Skewness: 19.33841254
Sum: 264241
Variance: 329266.8342

Histogram with fixed size bins (bins=10) [image not reproduced]

Common values

Value Count Frequency (%)
0 2812 50.8%
 
1 562 10.2%
 
2 393 7.1%
 
3 290 5.2%
 
4 234 4.2%
 
5 204 3.7%
 
6 159 2.9%
 
7 130 2.4%
 
8 105 1.9%
 
10 83 1.5%
 
Other values (24) 408 7.4%
 
(Missing) 150 2.7%
 
Minimum 5 values

Value Count Frequency (%)
0 2812 50.8%
 
1 562 10.2%
 
2 393 7.1%
 
3 290 5.2%
 
4 234 4.2%
 
Maximum 5 values

Value Count Frequency (%)
18000 1 < 0.1%
 
17000 1 < 0.1%
 
14000 1 < 0.1%
 
12000 1 < 0.1%
 
10000 1 < 0.1%
 

PURCHASES_FREQUENCY
Real number (ℝ≥0)

ZEROS
Distinct count: 69
Unique (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.206005977034355
Minimum: 0.0
Maximum: 1000.0
Zeros: 1392
Zeros (%): 25.2%
Memory size: 43.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0.363636
Q3: 0.833333
95-th percentile: 1
Maximum: 1000
Range: 1000
Interquartile range (IQR): 0.833333

Descriptive statistics

Standard deviation: 93.75767056
Coefficient of variation (CV): 7.681273525
Kurtosis: 82.01112325
Mean: 12.20600598
Median Absolute Deviation (MAD): 0.363636
Skewness: 8.892601479
Sum: 67499.21305
Variance: 8790.500789

Histogram with fixed size bins (bins=10) [image not reproduced]

Common values

Value Count Frequency (%)
0 1392 25.2%
 
1 881 15.9%
 
0.083333 465 8.4%
 
0.5 277 5.0%
 
0.166667 274 5.0%
 
0.25 237 4.3%
 
0.333333 233 4.2%
 
0.833333 230 4.2%
 
0.416667 216 3.9%
 
0.666667 211 3.8%
 
Other values (59) 1114 20.1%
 
Minimum 5 values

Value Count Frequency (%)
0 1392 25.2%
 
0.083333 465 8.4%
 
0.090909 35 0.6%
 
0.1 18 0.3%
 
0.111111 12 0.2%
 
Maximum 5 values

Value Count Frequency (%)
1000 26 0.5%
 
916.667 6 0.1%
 
900 1 < 0.1%
 
857.143 1 < 0.1%
 
833.333 4 0.1%
 

PURCHASES_TRX
Categorical

HIGH CARDINALITY
Distinct count: 80
Unique (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 43.3 KiB

Value Count
0 1353
1 460
12 387
2 252
6 243
Other values (75) 2835

Value Count Frequency (%)
0 1353 24.5%
 
1 460 8.3%
 
12 387 7.0%
 
2 252 4.6%
 
6 243 4.4%
 
4 203 3.7%
 
3 196 3.5%
 
5 190 3.4%
 
8 186 3.4%
 
7 184 3.3%
 
Other values (70) 1876 33.9%
 
Length

Max length: 7
Mean length: 1.45045208
Min length: 1

Value Count Frequency (%)
Decimal_Number 10 83.3%
Other_Punctuation 1 8.3%
Lowercase_Letter 1 8.3%

Value Count Frequency (%)
Common 11 91.7%
Latin 1 8.3%

Value Count Frequency (%)
ASCII 11 100.0%

ONEOFF_PURCHASES_FREQUENCY
Real number (ℝ≥0)

MISSING
ZEROS
Distinct count: 41
Unique (%): 1.5%
Missing: 2740
Missing (%): 49.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.14829775232974912
Minimum: 0.0
Maximum: 1.0
Zeros: 1464
Zeros (%): 26.5%
Memory size: 43.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0.166667
95-th percentile: 0.75
Maximum: 1
Range: 1
Interquartile range (IQR): 0.166667

Descriptive statistics

Standard deviation: 0.241687055
Coefficient of variation (CV): 1.629741862
Kurtosis: 3.442475174
Mean: 0.1482977523
Median Absolute Deviation (MAD): 0
Skewness: 2.013350785
Sum: 413.750729
Variance: 0.05841263257

Histogram with fixed size bins (bins=10) [image not reproduced]

Common values

Value Count Frequency (%)
0 1464 26.5%
 
0.083333 376 6.8%
 
0.166667 214 3.9%
 
0.25 131 2.4%
 
0.333333 88 1.6%
 
0.416667 83 1.5%
 
1 64 1.2%
 
0.5 61 1.1%
 
0.583333 42 0.8%
 
0.666667 38 0.7%
 
Other values (31) 229 4.1%
 
(Missing) 2740 49.5%
 
Minimum 5 values

Value Count Frequency (%)
0 1464 26.5%
 
0.083333 376 6.8%
 
0.090909 23 0.4%
 
0.1 13 0.2%
 
0.111111 11 0.2%
 
Maximum 5 values

Value Count Frequency (%)
1 64 1.2%
 
0.916667 28 0.5%
 
0.909091 1 < 0.1%
 
0.875 1 < 0.1%
 
0.833333 21 0.4%
 

CASH_ADVANCE_FREQUENCY
Real number (ℝ≥0)

MISSING
ZEROS
Distinct count: 46
Unique (%): 0.9%
Missing: 166
Missing (%): 3.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.11900540920954511
Minimum: 0.0
Maximum: 1.5
Zeros: 2801
Zeros (%): 50.7%
Memory size: 43.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0.166667
95-th percentile: 0.5
Maximum: 1.5
Range: 1.5
Interquartile range (IQR): 0.166667

Descriptive statistics

Standard deviation: 0.1732062886
Coefficient of variation (CV): 1.455448872
Kurtosis: 3.499384508
Mean: 0.1190054092
Median Absolute Deviation (MAD): 0
Skewness: 1.786846819
Sum: 638.345015
Variance: 0.03000041842

Histogram with fixed size bins (bins=10) [image not reproduced]

Common values

Value Count Frequency (%)
0 2801 50.7%
 
0.083333 664 12.0%
 
0.166667 466 8.4%
 
0.25 360 6.5%
 
0.333333 258 4.7%
 
0.416667 155 2.8%
 
0.5 105 1.9%
 
0.583333 75 1.4%
 
0.666667 56 1.0%
 
0.090909 49 0.9%
 
Other values (36) 375 6.8%
 
(Missing) 166 3.0%
 
Minimum 5 values

Value Count Frequency (%)
0 2801 50.7%
 
0.083333 664 12.0%
 
0.090909 49 0.9%
 
0.1 28 0.5%
 
0.111111 18 0.3%
 
Maximum 5 values

Value Count Frequency (%)
1.5 1 < 0.1%
 
1.166667 1 < 0.1%
 
1 4 0.1%
 
0.916667 2 < 0.1%
 
0.9 1 < 0.1%
 

CREDIT_LIMIT
Real number (ℝ≥0)

Distinct count: 134
Unique (%): 2.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3588.0952563609403
Minimum: 50.0
Maximum: 12500.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 43.3 KiB

Quantile statistics

Minimum: 50
5-th percentile: 1000
Q1: 1500
median: 2900
Q3: 5000
95-th percentile: 9000
Maximum: 12500
Range: 12450
Interquartile range (IQR): 3500

Descriptive statistics

Standard deviation: 2640.396238
Coefficient of variation (CV): 0.7358768509
Kurtosis: 0.5970263702
Mean: 3588.095256
Median Absolute Deviation (MAD): 1500
Skewness: 1.145162447
Sum: 19842166.77
Variance: 6971692.293

Histogram with fixed size bins (bins=10) [image not reproduced]

Common values

Value Count Frequency (%)
3000 563 10.2%
 
1500 542 9.8%
 
1200 457 8.3%
 
1000 454 8.2%
 
2500 426 7.7%
 
4000 317 5.7%
 
6000 281 5.1%
 
2000 280 5.1%
 
5000 225 4.1%
 
7000 147 2.7%
 
Other values (124) 1838 33.2%
 
Minimum 5 values

Value Count Frequency (%)
50 1 < 0.1%
 
150 4 0.1%
 
200 3 0.1%
 
300 12 0.2%
 
400 2 < 0.1%
 
Maximum 5 values

Value Count Frequency (%)
12500 12 0.2%
 
12000 31 0.6%
 
11500 25 0.5%
 
11000 25 0.5%
 
10750 1 < 0.1%
 

PAYMENTS
Real number (ℝ≥0)

UNIQUE
Distinct count: 5530
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1107.9898173103074
Minimum: 0.056466
Maximum: 9933.62261
Zeros: 0
Zeros (%): 0.0%
Memory size: 43.3 KiB

Quantile statistics

Minimum: 0.056466
5-th percentile: 124.9274707
Q1: 345.4311015
median: 671.0016995
Q3: 1354.931507
95-th percentile: 3710.658747
Maximum: 9933.62261
Range: 9933.566144
Interquartile range (IQR): 1009.500406

Descriptive statistics

Standard deviation: 1270.892564
Coefficient of variation (CV): 1.147025491
Kurtosis: 9.951139009
Mean: 1107.989817
Median Absolute Deviation (MAD): 399.8415645
Skewness: 2.78151989
Sum: 6127183.69
Variance: 1615167.91

Histogram with fixed size bins (bins=10) [image not reproduced]

Common values

Value Count Frequency (%)
192.781455 1 < 0.1%
 
628.237735 1 < 0.1%
 
661.24329 1 < 0.1%
 
876.63896 1 < 0.1%
 
523.910288 1 < 0.1%
 
264.032163 1 < 0.1%
 
284.093261 1 < 0.1%
 
1409.282903 1 < 0.1%
 
890.174668 1 < 0.1%
 
1072.433416 1 < 0.1%
 
Other values (5520) 5520 99.8%
 
Minimum 5 values

Value Count Frequency (%)
0.056466 1 < 0.1%
 
3.500505 1 < 0.1%
 
4.523555 1 < 0.1%
 
4.841543 1 < 0.1%
 
9.533313 1 < 0.1%
 
Maximum 5 values

Value Count Frequency (%)
9933.62261 1 < 0.1%
 
9858.055448 1 < 0.1%
 
9801.637331 1 < 0.1%
 
9724.871142 1 < 0.1%
 
9614.697558 1 < 0.1%
 

MINIMUM_PAYMENTS
Categorical

HIGH CARDINALITY
Distinct count: 5441
Unique (%): 98.4%
Missing: 0
Missing (%): 0.0%
Memory size: 43.3 KiB

Value Count
?? 89
299.351881 2
1311.061985 1
218.279194 1
596.541854 1
Other values (5436) 5436

Value Count Frequency (%)
?? 89 1.6%
 
299.351881 2 < 0.1%
 
1311.061985 1 < 0.1%
 
218.279194 1 < 0.1%
 
596.541854 1 < 0.1%
 
982.488109 1 < 0.1%
 
1315.479892 1 < 0.1%
 
351.744608 1 < 0.1%
 
92.369903?ñ 1 < 0.1%
 
233.788637?ñ 1 < 0.1%
 
Other values (5431) 5431 98.2%
 
Length

Max length: 13
Mean length: 9.779385172
Min length: 2

Value Count Frequency (%)
Decimal_Number 10 76.9%
Other_Punctuation 2 15.4%
Lowercase_Letter 1 7.7%

Value Count Frequency (%)
Common 12 92.3%
Latin 1 7.7%

Value Count Frequency (%)
ASCII 12 100.0%

TENURE
Categorical

MISSING
Distinct count: 19
Unique (%): 0.4%
Missing: 163
Missing (%): 2.9%
Memory size: 43.3 KiB

Value Count
12 4226
11 224
10 149
6 135
7 125
Other values (14) 508

Value Count Frequency (%)
12 4226 76.4%
 
11 224 4.1%
 
10 149 2.7%
 
6 135 2.4%
 
7 125 2.3%
 
-12 124 2.2%
 
8 119 2.2%
 
9 108 2.0%
 
?? 69 1.2%
 
12?ñ 56 1.0%
 
Other values (9) 32 0.6%
 
(Missing) 163 2.9%
 
Length

Max length: 4
Mean length: 1.988969259
Min length: 1

Value Count Frequency (%)
Decimal_Number 7 58.3%
Lowercase_Letter 3 25.0%
Dash_Punctuation 1 8.3%
Other_Punctuation 1 8.3%

Value Count Frequency (%)
Common 9 75.0%
Latin 3 25.0%

Value Count Frequency (%)
ASCII 11 100.0%

Interactions

[81 interaction plots; images not reproduced]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
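
The definition above can be sketched in a few lines of plain Python. This is an illustrative implementation of the formula as stated (covariance divided by the product of the standard deviations), not the code the report itself runs. The function name `pearson_r` and the sample data are made up for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: covariance of X and Y over the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2 * x + 1 for x in xs]

print(pearson_r(xs, ys))                         # increasing line: approximately 1
print(pearson_r(xs, [10 - 3 * x for x in xs]))   # decreasing line: approximately -1
print(pearson_r([100 + 5 * x for x in xs], ys))  # shifted and rescaled X: r unchanged
```

The last call illustrates the invariance property: shifting X by 100 and scaling it by 5 leaves r unchanged up to floating-point rounding.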

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
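
A minimal sketch of this calculation: rank both variables, then apply the Pearson formula to the ranks. Tied values receive the mean of their rank positions, which is a common convention but an assumption here, since the text does not say how ties are handled. All function names and the sample data are illustrative.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance over the product of standard deviations (common 1/n factors cancel)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [x ** 3 for x in xs]  # monotonic but not linear

print(spearman_rho(xs, ys))  # approximately 1: the relationship is perfectly monotonic
print(pearson_r(xs, ys))     # noticeably below 1: the relationship is not linear
```

The contrast between the two printed values shows why ρ is better suited to nonlinear monotonic relationships.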

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
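
The pair-counting rule can be written out directly. The sketch below implements exactly the formula in the text, with tied pairs counted as neither concordant nor discordant (often called tau-a). Whether the report's plotted values use this variant or a tie-adjusted one is not stated, so treat this as an illustration of the definition only. The function name is hypothetical.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau as (concordant - discordant) / total number of pairs.

    Tied pairs count as neither concordant nor discordant (tau-a).
    """
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        product = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if product > 0:
            concordant += 1  # both variables move in the same direction
        elif product < 0:
            discordant += 1  # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)


print(kendall_tau([1, 2, 3, 4], [10, 20, 30, 40]))  # 1.0
print(kendall_tau([1, 2, 3, 4], [40, 30, 20, 10]))  # -1.0
print(kendall_tau([1, 2, 3, 4], [1, 3, 2, 4]))      # 0.6666666666666666
```

In the last example five of the six pairs are concordant and one is discordant, giving (5 - 1) / 6.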

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
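
A minimal sketch of the bias-corrected estimator follows. It uses the correction as commonly stated for Bergsma's proposal: the chi-squared based φ² is reduced by (k-1)(r-1)/(n-1) and floored at 0, and the row and column counts r and k are adjusted in the same spirit. The function name and sample data are illustrative, and the report's own implementation may differ in detail.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of nominal values."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the r x k contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (joint.get((a, b), 0) - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Perfect association: each x value always pairs with the same y value.
print(cramers_v_corrected(["a", "b"] * 50, ["x", "y"] * 50))
# Independence: every (x, y) combination is equally frequent.
print(cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25))
```

The first call returns a value of approximately 1 and the second returns 0, matching the two ends of the range described above.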

Missing values

[4 missing-value plots; images not reproduced]

Sample

First rows

index | CUST_ID | GENDER | BALANCE | PURCHASES | BALANCE_FREQUENCY | CASH_ADVANCE | CASH_ADVANCE_TRX | PURCHASES_FREQUENCY | PURCHASES_TRX | ONEOFF_PURCHASES_FREQUENCY | CASH_ADVANCE_FREQUENCY | CREDIT_LIMIT | PAYMENTS | MINIMUM_PAYMENTS | TENURE
0 | C12529 | F | 107.944741 | 118.16 | 0.875000 | 472.818286 | 1.0 | 0.125000 | 2 | 0.125 | 0.125000 | 2500.0 | 192.781455 | 56.999671 | 8
1 | C14138 | NaN | 241.032979 | 0.00 | 1.000000 | 642.862505 | 1.0 | 0.000000 | 0 | NaN | 0.083333 | 1500.0 | 915.454305 | 195.162256 | 12
2 | C15409 | NaN | 894.357857 | 1164.00 | 1.000000 | 0.0 | 0.0 | 1.000000 | 12 | NaN | 0.000000 | 2000.0 | 907.603723 | 270.413449 | -12
3 | C18141 | F | -188.132508 | 515.88 | 1.000000 | 0.0 | NaN | 0.833333 | 14 | NaN | 0.000000 | 2700.0 | 601.729266 | 194.534934 | 12
4 | C15879 | NaN | 3881.679582 | 15.92 | 1.000000 | 2183.782456 | 9.0 | 0.083333 | 1 | NaN | 0.333333 | 5500.0 | 1032.183632 | 1129.747227 | 12
5 | C17660 | NaN | 1087.784698 | 0.00 | 1.000000 | 1562.703953 | 2.0 | 0.000000 | 0 | 0.000 | 0.166667 | 1500.0 | 3093.888643 | 298.011965 | 12
6 | C10916 | NaN | 1081.065726 | 554.85 | 1.000000 | 952.424906 | 8.0 | 0.500000 | 20 | 0.250 | 0.166667 | 2100.0 | 1898.828120 | 382.716751 | 12
7 | C15128 | NaN | 100.208311 | 0.00 | 0.909091 | 182.143966 | 1.0 | 0.000000 | 0 | NaN | 0.090909 | 3000.0 | 175.911508 | 145.244181 | 11
8 | C10109 | NaN | 862.072380 | 0.00 | 1.000000 | 920.309805 | 1.0 | 0.000000 | 0 | 0.000 | 0.083333 | 4000.0 | 2236.890255 | 214.828158 | 12
9 | C17983 | NaN | 1757.439933 | 0.00 | 0.833333 | 2408.007601 | 6.0 | 0.000000 | 0 | 0.000 | 0.166667 | 2500.0 | 175.115831 | 450.616731 | 6

Last rows

index | CUST_ID | GENDER | BALANCE | PURCHASES | BALANCE_FREQUENCY | CASH_ADVANCE | CASH_ADVANCE_TRX | PURCHASES_FREQUENCY | PURCHASES_TRX | ONEOFF_PURCHASES_FREQUENCY | CASH_ADVANCE_FREQUENCY | CREDIT_LIMIT | PAYMENTS | MINIMUM_PAYMENTS | TENURE
5520 | C16104 | NaN | 2525.683344 | 0.00 | 1.000000 | 285.193204 | 5.0 | 0.000000 | 0 | NaN | 0.166667 | 7000.0 | 1483.384610 | 702.052491 | 12
5521 | C19019 | NaN | 634.514354 | 0.00 | 0.909091 | 1682.137421 | 12.0 | 0.000000 | 0 | 0.000000 | 0.636364 | 1500.0 | 2162.277429 | 257.081648 | 11
5522 | C18355 | NaN | 930.656420 | 300.05 | 1.000000 | 0.0 | 0.0 | 0.750000 | 9 | NaN | 0.000000 | 1200.0 | 513.064156 | 330.422815 | 12
5523 | C18766 | NaN | 21.168201 | 236.40 | 1.000000 | 0.0 | 0.0 | 1.000000 | 24 | 1.000000 | NaN | 2500.0 | 217.008342 | 178.169321 | 12
5524 | C16616 | NaN | 846.091011 | 2599.20 | 1.000000 | 0.0 | 0.0 | 0.916667 | 19 | 0.333333 | 0.000000 | 3000.0 | 1900.699307 | 195.516066 | 12
5525 | C10075 | NaN | 656.013010 | 0.00 | 1000.000000 | 1474.349901 | 3.0 | 0.000000 | 0 | 0.000000 | 0.125000 | 7000.0 | 910.457985 | 140.983193 | 8
5526 | C17321 | NaN | 15.232505 | 384.00 | 0.272727 | 0.0 | 0.0 | 1.000000 | 12?ñ | NaN | 0.000000 | 1500.0 | 568.982664 | 54.449416 | 12
5527 | C12909 | NaN | 1023.124791 | 1537.93 | 1.000000 | 247.04197 | 1.0 | 0.750000 | 25 | 0.583333 | 0.083333 | 9000.0 | 1070.149971 | 235.241959 | -12
5528 | C15615 | F | 957.010021 | 604.80 | 1.000000 | 901.754709 | 3.0 | 1.000000 | 12 | NaN | 0.083333 | 1000.0 | 811.457190 | 926.087148 | 12
5529 | C12391 | NaN | 2664.700424 | 715.51 | 1.000000 | 494.573662 | 1.0 | 750.000000 | 11 | 0.083333 | 0.083333 | 3500.0 | 918.003032 | 792.902894 | 12